-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix QC test, fix bug in history time axis, fix history output averaging for timestep output #624
Conversation
- Fix bug in history time axis when sec_init is not zero. - Fix issue with time_beg and time_end uninitialized values. - Add support for averaging with histfreq='1' by allowing histfreq_n to be any value in that case. Extend and clean up construct_filename for history files. More could be done, but wanted to preserve backwards compatibility. - Add new calendar_sec2hms to converts daily seconds to hh:mm:ss. Update the calchk calendar unit tester to check this method - Remove abort test in bcstchk, this was just causing problems in regression testing - Remove known problems documentation about problems writing when istep=1. This issue does not exist anymore with the updated time manager. - Add new tests with hist_avg = false. Add set_nml.histinst.
- Add netcdf ststus checks and aborts in ice_read_write.F90 - Check for end of file when reading records in ice_read_write.F90 for ice_read_nc methods - Update set_nml.qc to better specify the test, turn off leap years since we're cycling 2005 data - Add check in c ice.t-test.py to make sure there is at least 1825 files, 5 years of data - Add QC run to base_suite.ts to verify qc runs to completion and possibility to use those results directly for QC validation - Clean up error messages and some indentation in ice_read_write.F90
- Add prod suite including 10 year gx1prod and qc test - Update unit test compare scripts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to go. I just had a few questions about the averaging. Doesn't have to be added here.
time_beg(ns) = real(time_beg(ns),kind=real_kind) | ||
endif | ||
endif | ||
if (avgct(ns) == c1) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this just initializes timedbl and time_beg at the beginning of the averaging interval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct. and timedbl is just a temporary local to make going from double to single a little cleaner. time_beg is now initialized whenever hist_avg is on because we should assume it might be written to the history file when hist_avg is on, even for histfreq='1' and histfreq_n=1. This moves away from treating histfreq='1' and histfreq_n=1 as special cases which allows for averaging across timesteps (histfreq='1') and allows a user to cleanly set hist_avg to true even if histfreq_n=1.
@@ -1814,7 +1815,7 @@ subroutine accum_hist (dt) | |||
n4Dfcum = n4Dscum + num_avail_hist_fields_4Df ! should equal num_avail_hist_fields_tot | |||
|
|||
do ns = 1,nstreams | |||
if (.not. hist_avg .or. histfreq(ns) == '1') then ! write snapshots | |||
if (.not. hist_avg) then ! write snapshots |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point, I would like hist_avg to be an array of size nstreams. Then we can control the instantaneous versus average on a stream by stream basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, I will make sure that's part of an issue.
I just merged master into this branch and testing looks good, https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#83068c7a3802948d3d89e36b6c871aa155afb684. The couple failures are expected and either cases still running or 10 year gx1prod tests I just added that will fail until we fix the leap year / cycling issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Tony, thanks a lot for putting in these fixes. I left a few comments/questions below, but in general this looks good.
I had started to fix the hist_avg
with histfreq=1
situation myself, and I checked your code against mine just to see if we were doing the same thing. What you propose is more complete than what I had managed to code up (I was dragged into other things). If you are interested, my branch is here: master...phil-blain:hist-averaging-multiple-dt.
Have a good week-end! :)
endif | ||
if (avgct(ns) == c1) then | ||
timedbl = (timesecs-dt)/(secday) | ||
time_beg(ns) = real(timedbl,kind=real_kind) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could do without timedbl
and just do this, no ?
time_beg(ns) = real((timesecs-dt)/(secday), kind=real_kind)
Is there any reason that I'm missing why we need a separate variable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We absolutely could implement what you propose.
if (histfreq(ns) == '1' .and. histfreq_n(ns) == 1) then ! timestep | ||
write(ncfile,'(a,a,i4.4,a,i2.2,a,i2.2,a,i5.5,a,a)') & | ||
history_file(1:lenstr(history_file))//trim(cstream),'_inst.', & | ||
iyear,'-',imonth,'-',iday,'-',msec,'.',trim(suffix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand why we need a separate if
branch for this first case. Is it not covered by the
else ! instantaneous
line at the end of the subroutine ?
I guess the idea is that someone might have hist_avg = true and histfreq =1 and histfreq_n = 1, and in this case we ignore hist_avg (since there is nothing to average) and write instantaneous output instead ?
I'm wondering if it might be more useful to catch that early in ice_init
and reset hist_avg to false in that case.. That would simplify the code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With multiple history streams and a single hist_avg variable, we do not want to change hist_avg to false if one of the streams has histfreq='1' and histfreq_n=1. A future goal would be to make hist_avg an array as Dave suggested then we have more options.
@@ -1,5 +1,5 @@ | |||
histfreq = 'm','d','1','h','x' | |||
histfreq_n = 1,2,6,4,1 | |||
histfreq_n = 1,2,6,4,5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the rationale here ? 'x' already sets this stream to off, no ? we would want 'y'
somewhere in histfreq
to test all 5 streams, but we would need a multi-year test...
cstream = '' | ||
!echmod ! this was implemented for CESM but it breaks post-processing software | ||
!echmod ! of other groups (including RASM which uses CESMCOUPLED) | ||
!echmod if (ns > 1) write(cstream,'(i1.1)') ns-1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines (setting an empty cstream
, the following comment and the commented line if (ns > 1) write(cstream,'(i1.1)') ns-1
) dates all the way back to our initial commit 7d740de (initial commit of CICE without icepack to git from LANL consortium branch r1229 May 24, 2017, 2017-05-24). Maybe it would be time to just remove cstream
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about removing the cstream but they aren't hurting anyone and I wasn't sure whether maybe some coupled system was using them, so I left them in. We could/should think about removing it, but we'd want to ask outside users whether they are using it.
if (hist_avg) then | ||
if (histfreq(ns) == '1' .or. histfreq(ns) == 'h'.or.histfreq(ns) == 'H') then | ||
! do nothing | ||
elseif (new_year) then | ||
iyear = iyear - 1 | ||
imonth = 12 | ||
iday = daymo(imonth) | ||
elseif (new_month) then | ||
imonth = mmonth - 1 | ||
iday = daymo(imonth) | ||
elseif (new_day) then | ||
iday = iday - 1 | ||
endif | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I looked at this code in the context of #551, I did not readily understand why we have to do those substractions here, and why we do not have to do any if histfreq = 1 or h...
Is it because the file names for averaged data are dated at the start of the averaging period ? I think a code comment explaining that would not hurt :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all part of a rather complex interaction between dates and history frequencies with a certain amount of assumption built-in. Basically, if you are doing annual, monthly, or daily averages, the history will be written on the "new_flag". At that point, you're in the next year, month, or day. So you subtract "1" so when you construct the filename, the name is associated with the year, month, or day being averaged. Of course, all this comes unglued if the histfreq_n is not 1 in terms of the filename. The instantaneous stuff and histfreq='1' just writes out a file with the current date in it because there isn't a general way to create a meaningful filename. Most of what I did tried to preserve what's there and expand how it works. There are many things that could be further improved.
@@ -4,7 +4,6 @@ smoke gx3 1x1 debug,diag1,run2day | |||
smoke gx3 1x4 debug,diag1,run2day | |||
smoke gx3 4x1 debug,diag1,run5day | |||
restart gx3 8x2 debug | |||
smoke gx1 64x1 qc,medium |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, seems you changed your mind :)
@@ -0,0 +1,3 @@ | |||
# Test Grid PEs Sets BFB-compare | |||
smoke gx1 64x1 qc,medium | |||
smoke gx1 64x2 gx1prod,long,run10year |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a missing newline at EOF here
! yearmax = 1000 | ||
yearmax = 100000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here again, this could be folded into "Fix QC issues"
@@ -9,7 +9,7 @@ smoke gx3 7x2 diag1,bigdiag,run1day,diagpt1 | |||
decomp gx3 4x2x25x29x5 none | |||
smoke gx3 4x2 diag1,run5day smoke_gx3_8x2_diag1_run5day | |||
smoke gx3 4x1 diag1,run5day,thread smoke_gx3_8x2_diag1_run5day | |||
smoke gx3 1x8 diag1,run5day,evp1d smoke_gx3_8x2_diag1_run5day | |||
smoke gx3 1x8 diag1,run5day,evp1d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the justification for removing the b4b compare with smoke_gx3_8x2_diag1_run5day
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it fails and is expected to fail. We would have to turn on the debug flag for the baseline and test to have a chance of bit-for-bit and that would not happen for all compilers. After watching it fail a few times, I decided to just remove it. We could/should consider adding a debug version of the test though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good. I've confirmed that it fixes our issue of the time-axis being set correctly when secs_init is not 0.
* ice_history_write: fix initial condition metadata under 'hist_avg' When writing averaged history outputs (hist_avg=.true.), this setting also affects the initial condition. Even if the actual data variables written to the initial condition are not averaged (they are taken more or less directly from the restart or the hard-coded defaults, modulo aggregation over categories), their attributes ('cell_method' and 'time_rep') imply they are averaged, and the 'bound' attribute of the 'time' variable refers to the 'time_bounds' variable. Make the metadata of the initial condition more correct by: - not writing the 'time_bounds' variable (and the corresponding 'd2' dimension) - not writing the 'bounds' attribute of the 'time' variable - not writing the 'cell_method' attributes of each variable - writing the 'time_rep' attribute of each variable as 'instantaneous' instead of 'averaged'. Do this by checking 'write_ic' at all places where we check for the value of 'hist_avg' to write the above variables and attributes in each of the 3 IO backends (binary, netcdf, pio2). * drivers/{nemo_concepts,standalone}: write initial condition at initial time In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep before writing the initial condition, such that the 'time' variable in the initial condition is not zero; it has a value of 1*dt (the model time step). The initial condition filename also reflects this, since 'msec' (model seconds) also has a value of 1*dt and is used in ice_history_shared::construct_filename. This leads to the initial condition filename not corresponding to the model initialization date/time but rather 1*dt later. Since we call 'accum_hist' after initializing the forcing, any forcing field written to the initial condition has values corresponding to msec=dt, whereas the ice state corresponds to msec=0, leading to an inconsistency. Fix that by calling 'accum_hist' to write the initial condition _before_ calling 'advance_timestep'. Since we now call 'accum_hist' before initializing the forcing, any forcing field written to the initial condition have its default, hard-coded value, instead of its value at time=dt. An improvement would be to read the forcing at time=dt, write the initial condition, advance the time step, and read the forcing again, but let's not complicate things too much for now. (cherry picked from commit 9a55ad9) Cherry-pick notes: There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due to 26d917a (Fix QC test, fix bug in history time axis, fix history output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that commit enabled averaging over multiple time steps and thus removed the assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had to add additional changes to 'ice_history_write' in the conditions that are checked before writing the NetCDF attributes since we do not yet have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715), 2022-05-10). Closes: https://gitlab.science.gc.ca/concepts/CICE/issues/19
* ice_history_write: fix initial condition metadata under 'hist_avg' When writing averaged history outputs (hist_avg=.true.), this setting also affects the initial condition. Even if the actual data variables written to the initial condition are not averaged (they are taken more or less directly from the restart or the hard-coded defaults, modulo aggregation over categories), their attributes ('cell_method' and 'time_rep') imply they are averaged, and the 'bound' attribute of the 'time' variable refers to the 'time_bounds' variable. Make the metadata of the initial condition more correct by: - not writing the 'time_bounds' variable (and the corresponding 'd2' dimension) - not writing the 'bounds' attribute of the 'time' variable - not writing the 'cell_method' attributes of each variable - writing the 'time_rep' attribute of each variable as 'instantaneous' instead of 'averaged'. Do this by checking 'write_ic' at all places where we check for the value of 'hist_avg' to write the above variables and attributes in each of the 3 IO backends (binary, netcdf, pio2). * drivers/{nemo_concepts,standalone}: write initial condition at initial time In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep before writing the initial condition, such that the 'time' variable in the initial condition is not zero; it has a value of 1*dt (the model time step). The initial condition filename also reflects this, since 'msec' (model seconds) also has a value of 1*dt and is used in ice_history_shared::construct_filename. This leads to the initial condition filename not corresponding to the model initialization date/time but rather 1*dt later. Since we call 'accum_hist' after initializing the forcing, any forcing field written to the initial condition has values corresponding to msec=dt, whereas the ice state corresponds to msec=0, leading to an inconsistency. Fix that by calling 'accum_hist' to write the initial condition _before_ calling 'advance_timestep'. Since we now call 'accum_hist' before initializing the forcing, any forcing field written to the initial condition have its default, hard-coded value, instead of its value at time=dt. An improvement would be to read the forcing at time=dt, write the initial condition, advance the time step, and read the forcing again, but let's not complicate things too much for now. (cherry picked from commit 9a55ad9) Cherry-pick notes: There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due to 26d917a (Fix QC test, fix bug in history time axis, fix history output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that commit enabled averaging over multiple time steps and thus removed the assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had to add additional changes to 'ice_history_write' in the conditions that are checked before writing the NetCDF attributes since we do not yet have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715), 2022-05-10).
* ice_history_write: fix initial condition metadata under 'hist_avg' When writing averaged history outputs (hist_avg=.true.), this setting also affects the initial condition. Even if the actual data variables written to the initial condition are not averaged (they are taken more or less directly from the restart or the hard-coded defaults, modulo aggregation over categories), their attributes ('cell_method' and 'time_rep') imply they are averaged, and the 'bound' attribute of the 'time' variable refers to the 'time_bounds' variable. Make the metadata of the initial condition more correct by: - not writing the 'time_bounds' variable (and the corresponding 'd2' dimension) - not writing the 'bounds' attribute of the 'time' variable - not writing the 'cell_method' attributes of each variable - writing the 'time_rep' attribute of each variable as 'instantaneous' instead of 'averaged'. Do this by checking 'write_ic' at all places where we check for the value of 'hist_avg' to write the above variables and attributes in each of the 3 IO backends (binary, netcdf, pio2). * drivers/{nemo_concepts,standalone}: write initial condition at initial time In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep before writing the initial condition, such that the 'time' variable in the initial condition is not zero; it has a value of 1*dt (the model time step). The initial condition filename also reflects this, since 'msec' (model seconds) also has a value of 1*dt and is used in ice_history_shared::construct_filename. This leads to the initial condition filename not corresponding to the model initialization date/time but rather 1*dt later. Since we call 'accum_hist' after initializing the forcing, any forcing field written to the initial condition has values corresponding to msec=dt, whereas the ice state corresponds to msec=0, leading to an inconsistency. Fix that by calling 'accum_hist' to write the initial condition _before_ calling 'advance_timestep'. Since we now call 'accum_hist' before initializing the forcing, any forcing field written to the initial condition have its default, hard-coded value, instead of its value at time=dt. An improvement would be to read the forcing at time=dt, write the initial condition, advance the time step, and read the forcing again, but let's not complicate things too much for now. (cherry picked from commit 9a55ad9) Cherry-pick notes: There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due to 26d917a (Fix QC test, fix bug in history time axis, fix history output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that commit enabled averaging over multiple time steps and thus removed the assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had to add additional changes to 'ice_history_write' in the conditions that are checked before writing the NetCDF attributes since we do not yet have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715), 2022-05-10).
* ice_history_write: fix initial condition metadata under 'hist_avg' When writing averaged history outputs (hist_avg=.true.), this setting also affects the initial condition. Even if the actual data variables written to the initial condition are not averaged (they are taken more or less directly from the restart or the hard-coded defaults, modulo aggregation over categories), their attributes ('cell_method' and 'time_rep') imply they are averaged, and the 'bound' attribute of the 'time' variable refers to the 'time_bounds' variable. Make the metadata of the initial condition more correct by: - not writing the 'time_bounds' variable (and the corresponding 'd2' dimension) - not writing the 'bounds' attribute of the 'time' variable - not writing the 'cell_method' attributes of each variable - writing the 'time_rep' attribute of each variable as 'instantaneous' instead of 'averaged'. Do this by checking 'write_ic' at all places where we check for the value of 'hist_avg' to write the above variables and attributes in each of the 3 IO backends (binary, netcdf, pio2). * drivers/{nemo_concepts,standalone}: write initial condition at initial time In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep before writing the initial condition, such that the 'time' variable in the initial condition is not zero; it has a value of 1*dt (the model time step). The initial condition filename also reflects this, since 'msec' (model seconds) also has a value of 1*dt and is used in ice_history_shared::construct_filename. This leads to the initial condition filename not corresponding to the model initialization date/time but rather 1*dt later. Since we call 'accum_hist' after initializing the forcing, any forcing field written to the initial condition has values corresponding to msec=dt, whereas the ice state corresponds to msec=0, leading to an inconsistency. Fix that by calling 'accum_hist' to write the initial condition _before_ calling 'advance_timestep'. Since we now call 'accum_hist' before initializing the forcing, any forcing field written to the initial condition have its default, hard-coded value, instead of its value at time=dt. An improvement would be to read the forcing at time=dt, write the initial condition, advance the time step, and read the forcing again, but let's not complicate things too much for now. (cherry picked from commit 9a55ad9) Cherry-pick notes: There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due to 26d917a (Fix QC test, fix bug in history time axis, fix history output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that commit enabled averaging over multiple time steps and thus removed the assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had to add additional changes to 'ice_history_write' in the conditions that are checked before writing the NetCDF attributes since we do not yet have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715), 2022-05-10). Rebase onto CICE6.4.1 notes: There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due to the same commits mentioned above. They are both included in CICE6.4.1, but the original cherry-picked commit (9a55ad9 (cicecore: correct initial condition metadata (CICE-Consortium#818), 2023-03-13)) is not. The correct resolution was thus to: - keep ours version for calls to ice_write_hist_attrs - cherry-pick from 9a55ad9 the changes to ice_write_hist_attrs - in io_pio2, change 'if (hist_avg)' to 'if (hist_avg .and. .not. write_ic)'
PR checklist
Fix QC test, fix bug in history time axis, fix history output averaging for timestep output
apcraig
Tests pass bit-for-bit on cheyenne, https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#7b5c2b404addb356c75d87f7cb4ae2fa59bfc20b
Fix history features
Fix QC issues
Fix unit tests and testing